56 research outputs found
Hoaxy: A Platform for Tracking Online Misinformation
Massive amounts of misinformation have been observed to spread in
uncontrolled fashion across social media. Examples include rumors, hoaxes, fake
news, and conspiracy theories. At the same time, several journalistic
organizations devote significant efforts to high-quality fact checking of
online claims. The resulting information cascades contain instances of both
accurate and inaccurate information, unfold over multiple time scales, and
often reach audiences of considerable size. All these factors pose challenges
for the study of the social dynamics of online news sharing. Here we introduce
Hoaxy, a platform for the collection, detection, and analysis of online
misinformation and its related fact-checking efforts. We discuss the design of
the platform and present a preliminary analysis of a sample of public tweets
containing both fake news and fact checking. We find that, in the aggregate,
the sharing of fact-checking content typically lags that of misinformation by
10--20 hours. Moreover, fake news is dominated by very active users, while
fact checking is a more grass-roots activity. With the increasing risks
connected to massive online misinformation, social news observatories have the
potential to help researchers, journalists, and the general public understand
the dynamics of real and fake news sharing.
Comment: 6 pages, 6 figures, submitted to Third Workshop on Social News On the Web.
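The 10--20 hour lag reported above can be estimated by aligning two hourly tweet-count series. This is a minimal sketch of that idea using a plain dot-product cross-correlation; the series below are synthetic toy data, not Hoaxy output.

```python
# Sketch: estimate the lag between fake-news sharing and fact-checking
# activity by cross-correlating two hourly tweet-count series.
# The series below are illustrative toy data, not Hoaxy data.

def lag_of_max_correlation(fake, fact, max_lag):
    """Return the lag (in hours) at which shifting `fact` back in time
    best aligns it with `fake`, using a plain dot-product score."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Compare fake[t] with fact[t + lag]: fact-checking lags behind.
        score = sum(a * b for a, b in zip(fake, fact[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Toy hourly counts: the fact-check wave is the fake-news wave delayed 13 h.
fake = [0] * 5 + [10, 40, 80, 60, 30, 15, 5] + [0] * 20
fact = [0] * 13 + [v // 2 for v in fake[:len(fake) - 13]]
print(lag_of_max_correlation(fake, fact, 24))  # → 13 (hours of lag)
```

On real data one would use counts of shares of fake-news URLs versus shares of the corresponding fact-checking URLs, binned at the same resolution.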
How algorithmic popularity bias hinders or promotes quality
Algorithms that favor popular items are used to help us select among many
choices, from engaging articles on a social media news feed to songs and books
that others have purchased, and from top-ranked search engine results to
highly-cited scientific papers. The goal of these algorithms is to identify
high-quality items such as reliable news, beautiful movies, prestigious
information sources, and important discoveries --- in short, high-quality
content should rank at the top. Prior work has shown that choosing what is
popular may amplify random fluctuations and ultimately lead to sub-optimal
rankings. Nonetheless, it is often assumed that recommending what is popular
will help high-quality content "bubble up" in practice. Here we identify the
conditions in which popularity may be a viable proxy for quality content by
studying a simple model of cultural market endowed with an intrinsic notion of
quality. A parameter representing the cognitive cost of exploration controls
the critical trade-off between quality and popularity. We find a regime of
intermediate exploration cost where an optimal balance exists, such that
choosing what is popular actually promotes high-quality items to the top.
Outside of these limits, however, popularity bias is more likely to hinder
quality. These findings clarify the effects of algorithmic popularity bias on
quality outcomes, and may inform the design of more principled mechanisms for
techno-social cultural markets.
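The cultural-market model described above can be sketched as a simple agent-based simulation. This is an illustrative toy, not the paper's exact model: we assume adoption probability equals an item's intrinsic quality, and a parameter `beta` stands in for the exploration trade-off (high `beta` means agents mostly copy what is already popular, low `beta` means they explore at random).

```python
# Toy popularity-bias simulation (assumptions: adoption probability equals
# intrinsic quality; `beta` is the probability of following popularity
# rather than exploring uniformly at random).
import random

def simulate(beta, n_items=50, steps=5000, seed=0):
    rng = random.Random(seed)
    quality = [rng.random() for _ in range(n_items)]
    popularity = [1] * n_items  # every item starts with one "view"
    for _ in range(steps):
        if rng.random() < beta:
            # follow popularity: pick proportionally to current counts
            item = rng.choices(range(n_items), weights=popularity)[0]
        else:
            # explore: pick uniformly at random
            item = rng.randrange(n_items)
        if rng.random() < quality[item]:  # adopt only if quality convinces
            popularity[item] += 1
    top = max(range(n_items), key=popularity.__getitem__)
    return quality[top]  # intrinsic quality of the most popular item

for beta in (0.0, 0.5, 0.9):
    print(f"beta={beta}: quality of most popular item = {simulate(beta):.2f}")
```

Sweeping `beta` in such a simulation is one way to probe the regime where popularity promotes, rather than hinders, high-quality items.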
Finding Streams in Knowledge Graphs to Support Fact Checking
The volume and velocity of information generated online outpace the capacity
of current journalistic practices to fact-check claims at the same rate.
Computational approaches for fact checking may be the key to help mitigate the
risks of massive misinformation spread. Such approaches can be designed to not
only be scalable and effective at assessing veracity of dubious claims, but
also to boost a human fact checker's productivity by surfacing relevant facts
and patterns to aid their analysis. To this end, we present a novel,
unsupervised network-flow based approach to determine the truthfulness of a
statement of fact expressed in the form of a (subject, predicate, object)
triple. We view a knowledge graph of background information about real-world
entities as a flow network, and knowledge as a fluid, abstract commodity. We
show that computational fact checking of such a triple then amounts to finding
a "knowledge stream" that emanates from the subject node and flows toward the
object node through paths connecting them. Evaluation on a range of real-world
and hand-crafted datasets of facts related to entertainment, business, sports,
geography and more reveals that this network-flow model can be very effective
in discerning true statements from false ones, outperforming existing
algorithms on many test cases. Moreover, the model is expressive in its ability
to automatically discover several useful path patterns and surface relevant
facts that may help a human fact checker corroborate or refute a claim.
Comment: Extended version of the paper in proceedings of ICDM 2017.
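The "knowledge stream" idea above amounts to a maximum-flow computation between the subject and object nodes. This is a minimal sketch with a tiny hand-made graph and unit capacities (the real method assigns capacities from the knowledge graph itself; the graph, entity names, and capacities here are toy assumptions):

```python
# Sketch: treat a tiny knowledge graph as a flow network with unit edge
# capacities and measure how much "knowledge" flows from subject to object.
from collections import deque

def max_knowledge_flow(edges, source, sink):
    """Edmonds-Karp max flow; `edges` maps (u, v) -> capacity."""
    cap = dict(edges)
    for (u, v) in list(cap):          # add reverse residual edges
        cap.setdefault((v, u), 0)
    adj = {}
    for (u, v) in cap:
        adj.setdefault(u, set()).add(v)
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {source: None}
        q = deque([source])
        while q and sink not in parent:
            u = q.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            return flow
        path, v = [], sink            # reconstruct the path, find bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[e] for e in path)
        for (u, v) in path:           # push flow along the path
            cap[(u, v)] -= push
            cap[(v, u)] += push
        flow += push

# Toy check of a (subject, predicate, object) triple: two independent
# chains of background facts connect the subject to the object.
edges = {
    ("Obama", "Hawaii"): 1, ("Hawaii", "USA"): 1,
    ("Obama", "US_Senate"): 1, ("US_Senate", "USA"): 1,
}
print(max_knowledge_flow(edges, "Obama", "USA"))  # → 2
```

A larger flow value (more, stronger paths) is read as stronger support for the triple; the individual augmenting paths are the "relevant facts" that can be surfaced to a human fact checker.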
Factuality Challenges in the Era of Large Language Models
The emergence of tools based on Large Language Models (LLMs), such as
OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered
immense public attention. These incredibly useful, natural-sounding tools mark
significant advances in natural language generation, yet they exhibit a
propensity to generate false, erroneous, or misleading content -- commonly
referred to as "hallucinations." Moreover, LLMs can be exploited for malicious
applications, such as generating false but credible-sounding content and
profiles at scale. This poses a significant challenge to society in terms of
the potential deception of users and the increasing dissemination of inaccurate
information. In light of these risks, we explore the kinds of technological
innovations, regulatory reforms, and AI literacy initiatives needed from
fact-checkers, news organizations, and the broader research and policy
communities. By identifying the risks, the imminent threats, and some viable
solutions, we seek to shed light on navigating various aspects of veracity in
the era of generative AI.
Comment: Our article offers a comprehensive examination of the challenges and risks associated with Large Language Models (LLMs), focusing on their potential impact on the veracity of information in today's digital landscape.
Computational fact checking from knowledge networks
Traditional fact checking by expert journalists cannot keep up with the
enormous volume of information that is now generated online. Computational fact
checking may significantly enhance our ability to evaluate the veracity of
dubious information. Here we show that the complexities of human fact checking
can be approximated quite well by finding the shortest path between concept
nodes under properly defined semantic proximity metrics on knowledge graphs.
Framed as a network problem this approach is feasible with efficient
computational techniques. We evaluate this approach by examining tens of
thousands of claims related to history, entertainment, geography, and
biographical information using a public knowledge graph extracted from
Wikipedia. Statements independently known to be true consistently receive
higher support via our method than do false ones. These findings represent a
significant step toward scalable computational fact-checking methods that may
one day mitigate the spread of harmful misinformation.
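The shortest-path idea above can be sketched as follows: a path supports a claim more strongly when its intermediate nodes are specific (low degree) rather than generic hubs. This toy scores a path as 1 / (1 + sum of log-degrees of its intermediate nodes) and takes the best score over all simple paths; the tiny undirected graph and scoring details are illustrative assumptions, not the paper's exact metric.

```python
# Sketch of a shortest-path truth score on a toy knowledge graph:
# specific (low-degree) intermediate nodes yield higher support.
import math

def path_score(path, degree):
    inner = path[1:-1]  # intermediate nodes only; direct edges score 1.0
    return 1.0 / (1.0 + sum(math.log(degree[v]) for v in inner))

def truth_score(adj, s, o):
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    best = 0.0
    stack = [(s, [s])]
    while stack:  # DFS over all simple paths from s to o
        node, path = stack.pop()
        if node == o:
            best = max(best, path_score(path, degree))
            continue
        for nxt in adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return best

# Tiny undirected knowledge graph around a true claim.
adj = {
    "Rome": {"Italy", "Tiber"},
    "Italy": {"Rome", "Europe"},
    "Tiber": {"Rome", "Europe"},
    "Europe": {"Italy", "Tiber"},
}
print(truth_score(adj, "Rome", "Italy"))  # direct edge → 1.0
```

True statements tend to be connected by short paths through specific entities, so they receive higher support than false ones under this kind of metric.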
Colorectal cancer after bariatric surgery (Cric-Abs 2020): Sicob (Italian society of obesity surgery) endorsed national survey
Background: The published colorectal cancer (CRC) outcomes after bariatric surgery (BS) are conflicting, with some anecdotal studies reporting increased risks. The present nationwide survey CRIC-ABS 2020 (Colo-Rectal Cancer Incidence After Bariatric Surgery 2020), endorsed by the Italian Society of Obesity Surgery (SICOB), aims to report CRC incidence in Italy after BS, comparing the two commonest laparoscopic procedures: sleeve gastrectomy (SG) and Roux-en-Y gastric bypass (GBP).
Methods: Two online questionnaires, the first with 11 questions on SG/GBP frequency over a 5-10 year follow-up and the second with 15 questions on CRC incidence and management, were administered to 53 high-volume referral bariatric centers. A standardized incidence ratio (SIR, the ratio of the observed number of cases to the expected number) with 95% confidence intervals (CI) was calculated, along with CRC incidence risk computation for baseline characteristics.
Results: Data for 20,571 patients from 34 (63%) centers between 2010 and 2015 were collected, of which 14,431 had SG (70%) and 6140 GBP (30%). Twenty-two patients (0.10%, mean age 53 +/- 12 years, 13 males; SG: 12, GBP: 10) developed CRC after 4.3 +/- 2.3 years. Overall incidence was higher among males in both groups (SG: 0.15% vs 0.05%; GBP: 0.35% vs 0.09%), with the GBP cohort having slightly older patients. The right colon was most affected (n = 13), and SIR values categorized by sex were below 1, except for GBP males (SIR = 1.07).
Conclusion: A low CRC incidence after BS at 10 years (0.10%) and no difference between procedures were seen, suggesting that BS does not trigger neoplasm development.
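The standardized incidence ratio used in the survey is simply observed cases divided by expected cases. A minimal sketch, with an approximate 95% confidence interval on the log scale (the expected count of 20.6 below is a hypothetical placeholder; the abstract reports only the 22 observed cases):

```python
# Sketch: standardized incidence ratio (SIR = observed / expected) with an
# approximate log-scale 95% CI: SIR * exp(+/- 1.96 / sqrt(observed)).
import math

def sir_with_ci(observed, expected, z=1.96):
    sir = observed / expected
    half = z / math.sqrt(observed)  # log-normal approximation
    return sir, sir * math.exp(-half), sir * math.exp(half)

# 22 observed CRC cases against a hypothetical 20.6 expected cases
sir, lo, hi = sir_with_ci(22, 20.6)
print(f"SIR = {sir:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A CI that straddles 1 (as here) indicates no statistically detectable excess incidence; exact Poisson intervals would be preferred for very small counts.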